Vessel segmentation in medical images is one of the key tasks for the diagnosis of vascular diseases and for treatment planning. Although learning-based segmentation methods have been extensively studied, supervised methods require a large number of ground-truth labels, and confusing background structures make it difficult for neural networks to segment vessels in an unsupervised manner. To address this, we introduce a novel diffusion adversarial representation learning (DARL) model that leverages a denoising diffusion probabilistic model with adversarial learning, and apply it to vessel segmentation. In particular, for self-supervised vessel segmentation, DARL learns the background image distribution using a diffusion module, which enables a generation module to effectively provide vessel representations. Moreover, via adversarial learning based on the proposed switchable spatially-adaptive denormalization, our model estimates synthetic fake vessel images as well as vessel segmentation masks, which further allows the model to capture vessel-relevant semantic information. Once the proposed model is trained, it generates segmentation masks in a single step and can be applied to general vessel structure segmentation of coronary angiography and retinal images. Experimental results on various datasets show that our method significantly outperforms existing unsupervised and self-supervised methods in vessel segmentation.
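The spatially-adaptive denormalization mentioned above can be illustrated with a minimal numerical sketch: normalize a feature map, then modulate it with a per-pixel scale and shift derived from a semantic mask. The function name, the scalar "weights" standing in for learned convolutions, and the single-channel setup are illustrative assumptions, not the paper's actual layer.

```python
import numpy as np

def spade_norm(x, mask, gamma_w, beta_w):
    """Minimal spatially-adaptive (de)normalization sketch:
    normalize the feature map x, then modulate it with per-pixel
    scale/shift predicted from a semantic mask. Here the 'prediction'
    is just a scalar weight times the mask (a toy stand-in for the
    learned convolutions of a real SPADE-style layer)."""
    mu, sigma = x.mean(), x.std() + 1e-5
    normed = (x - mu) / sigma
    gamma = gamma_w * mask  # per-pixel scale from the mask
    beta = beta_w * mask    # per-pixel shift from the mask
    return (1 + gamma) * normed + beta
```

With an all-zero mask the layer reduces to plain normalization, which is the "switchable" behavior: the modulation only acts where the semantic mask is active.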
Temporal volume images with 3D+t (4D) information are often used in medical imaging to statistically analyze temporal dynamics or capture disease progression. Although deep-learning-based generative models for natural images have been extensively studied, approaches for temporal medical image generation, such as 4D cardiac volume data, are limited. In this work, we present a novel deep learning model that generates intermediate temporal volumes between source and target volumes. Specifically, we propose a diffusion deformable model (DDM) by adapting the denoising diffusion probabilistic model that has recently been investigated for realistic image generation. Our proposed DDM is composed of diffusion and deformation modules, so that DDM can learn spatial deformation information between the source and target volumes and provide a latent code for generating intermediate frames along a geodesic path. Once our model is trained, the latent code estimated from the diffusion module is simply interpolated and fed into the deformation module, which enables DDM to generate temporal frames along a continuous trajectory while preserving the topology of the source image. We demonstrate the proposed method on 4D cardiac MR image generation between the diastolic and systolic phases for each subject. Compared to existing deformation methods, our DDM achieves high performance in temporal volume generation.
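The interpolation idea described above can be sketched in a few lines: linearly scale a latent deformation code to obtain intermediate-frame codes, and apply a deformation to the source image. The function names and the toy 1-D warp are illustrative assumptions; the paper's deformation module is a learned 3-D registration network, not this sketch.

```python
import numpy as np

def interpolate_latent(code, num_frames):
    """Linearly scale a latent deformation code to produce
    intermediate-frame codes along a path from source (t=0)
    to target (t=1)."""
    return [t * code for t in np.linspace(0.0, 1.0, num_frames)]

def warp_1d(image, displacement):
    """Toy 1-D warp: sample the source image at positions shifted
    by the displacement field, with linear interpolation."""
    positions = np.arange(len(image)) - displacement
    positions = np.clip(positions, 0, len(image) - 1)
    lo = np.floor(positions).astype(int)
    hi = np.minimum(lo + 1, len(image) - 1)
    frac = positions - lo
    return (1 - frac) * image[lo] + frac * image[hi]
```

At t=0 the scaled code yields zero displacement, so the generated frame equals the source image; intermediate t values trace a continuous trajectory toward the target.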
Deformable image registration is one of the fundamental tasks in medical imaging and computer vision. Classical registration algorithms usually rely on iterative optimization to provide accurate deformations, which incurs high computational cost. Although many deep-learning-based methods have been developed for fast image registration, estimating deformation fields with fewer topological folding problems remains challenging. Moreover, these methods only enable registration to a single fixed image, and it is not possible to obtain continuously varying registration results between the moving and fixed images. To address this, we present a novel diffusion-model-based probabilistic image registration approach, called DiffuseMorph. Specifically, our model learns the score function of the deformation between the moving and fixed images. Similar to existing diffusion models, DiffuseMorph not only provides synthetic deformed images through a reverse diffusion process, but also enables various levels of deformation of the moving image via the latent space. Experimental results on 2D facial expression image and 3D brain image registration tasks demonstrate that our method can provide flexible and accurate deformation with topology-preservation capability.
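The topological folding mentioned above is commonly quantified via the Jacobian determinant of the deformation: a fold occurs where the determinant of the mapping (identity + displacement) is non-positive. The sketch below computes this for a 2-D displacement field; the function name and the finite-difference implementation are assumptions for illustration, not the paper's evaluation code.

```python
import numpy as np

def folding_ratio(disp):
    """Fraction of pixels where the deformation folds, i.e. where
    the Jacobian determinant of phi(y, x) = (y + dy, x + dx) is <= 0.
    disp: array of shape (2, H, W) holding (dy, dx) displacements."""
    dy, dx = disp
    # Finite-difference partial derivatives of the mapping phi.
    dphiy_dy = 1.0 + np.gradient(dy, axis=0)
    dphiy_dx = np.gradient(dy, axis=1)
    dphix_dy = np.gradient(dx, axis=0)
    dphix_dx = 1.0 + np.gradient(dx, axis=1)
    det = dphiy_dy * dphix_dx - dphiy_dx * dphix_dy
    return float(np.mean(det <= 0))
```

A zero displacement field gives the identity mapping (determinant 1 everywhere, no folds), while a displacement that reverses orientation along an axis folds every pixel.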
Federated learning, which exchanges the weights of a neural network across clients, is gaining attention in the medical domain because it allows training on a large corpus of decentralized data while maintaining data privacy. For example, this enables neural network training for COVID-19 diagnosis on chest X-ray (CXR) images without collecting patient CXR data across multiple hospitals. Unfortunately, the exchange of weights quickly consumes the network bandwidth if a highly expressive network architecture is employed. So-called split learning partially solves this problem by dividing a neural network into client-side and server-side parts, so that the client side of the network takes up less extensive computational resources and bandwidth. However, it is not clear how to find the optimal split without sacrificing overall network performance. To amalgamate these methods and thereby maximize their distinct strengths, here we show that the Vision Transformer, a recently developed deep learning architecture with a straightforwardly decomposable configuration, is ideally suited for split learning without sacrificing performance. Even under non-independent and non-identically distributed data, which emulates a realistic collaboration between hospitals using CXR datasets from multiple sources, the proposed framework achieves performance comparable to that of data-centralized training. In addition, the proposed framework with heterogeneous multi-task clients also improves individual task performance, including COVID-19 diagnosis, eliminating the need to share large weights with innumerable parameters. Our results affirm the suitability of Transformers for collaborative learning in medical imaging and pave the way for future real-world implementations.
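The client/server division described above can be sketched structurally: the client keeps a small embedding head and sends only intermediate features to the server, which runs the compute-heavy shared body. The class names, dimensions, and the single linear layers standing in for the Transformer parts are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

class ClientHead:
    """Client-side part of a split network: a patch-embedding layer.
    Only its small weights and the intermediate features cross the
    network, not the heavy body."""
    def __init__(self, patch_dim, embed_dim):
        self.w = rng.normal(size=(patch_dim, embed_dim)) * 0.02

    def forward(self, patches):        # (n_patches, patch_dim)
        return patches @ self.w        # features sent to the server

class ServerBody:
    """Server-side part: the compute-heavy shared body, reduced here
    to one linear layer + ReLU as a stand-in for Transformer blocks."""
    def __init__(self, embed_dim):
        self.w = rng.normal(size=(embed_dim, embed_dim)) * 0.02

    def forward(self, feats):
        return np.maximum(feats @ self.w, 0.0)
```

The point of the split is bandwidth: per round, clients transmit activations of shape (n_patches, embed_dim) instead of the full model's weights.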
3D-aware image synthesis focuses on preserving spatial consistency while generating high-resolution images with fine details. Recently, the Neural Radiance Field (NeRF) has been introduced for synthesizing novel views with low computational cost and superior performance. While several works investigate generative NeRFs and show remarkable achievements, they cannot handle conditional and continuous feature manipulation in the generation procedure. In this work, we introduce a novel model, called Class-Continuous Conditional Generative NeRF ($\text{C}^{3}$G-NeRF), which can synthesize conditionally manipulated photorealistic 3D-consistent images by projecting conditional features into the generator and the discriminator. The proposed $\text{C}^{3}$G-NeRF is evaluated on three image datasets: AFHQ, CelebA, and Cars. As a result, our model shows strong 3D consistency with fine details and smooth interpolation in conditional feature manipulation. For instance, $\text{C}^{3}$G-NeRF exhibits a Fr\'echet Inception Distance (FID) of 7.64 in 3D-aware face image synthesis at a $\text{128}^{2}$ resolution. Additionally, we provide FIDs of generated 3D-aware images for each class of the datasets, as it is possible to synthesize class-conditional images with $\text{C}^{3}$G-NeRF.
In both terrestrial and marine ecology, physical tagging is a frequently used method for studying population dynamics and behavior. However, such tagging techniques are increasingly being replaced by individual re-identification using image analysis. This paper introduces a contrastive learning-based model for identifying individuals. The model uses the first parts of the Inception v3 network, supported by a projection head, and we use contrastive learning to find similar or dissimilar image pairs from a collection of uniform photographs. We apply this technique to the corkwing wrasse, Symphodus melops, an ecologically and commercially important fish species. Photos are taken during repeated catches of the same individuals from a wild population, where the intervals between individual sightings range from a few days to several years. On our dataset, the model achieves a one-shot accuracy of 0.35, a 5-shot accuracy of 0.56, and a 100-shot accuracy of 0.88.
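The contrastive objective described above (pulling embeddings of the same individual together, pushing different individuals apart) can be sketched as a simplified NT-Xent-style loss over a batch of positive pairs. This is a generic sketch under assumed names; the paper does not specify this exact loss.

```python
import numpy as np

def nt_xent(z1, z2, tau=0.5):
    """Simplified contrastive loss for a batch of positive pairs
    (z1[i], z2[i]): normalize embeddings, compute a cosine-similarity
    matrix, and take the cross-entropy with the positives on the
    diagonal. A sketch, not the paper's exact objective."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    sim = z1 @ z2.T / tau                        # (n, n) similarities
    sim = sim - sim.max(axis=1, keepdims=True)   # numerical stability
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))    # positives on diagonal
```

When matched pairs are more similar than mismatched ones, the diagonal log-probabilities rise and the loss falls, which is what drives re-identification embeddings apart per individual.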
Feature selection helps reduce data acquisition costs in ML, but the standard approach is to train models with static feature subsets. Here, we consider the dynamic feature selection (DFS) problem where a model sequentially queries features based on the presently available information. DFS is often addressed with reinforcement learning (RL), but we explore a simpler approach of greedily selecting features based on their conditional mutual information. This method is theoretically appealing but requires oracle access to the data distribution, so we develop a learning approach based on amortized optimization. The proposed method is shown to recover the greedy policy when trained to optimality and outperforms numerous existing feature selection methods in our experiments, thus validating it as a simple but powerful approach for this problem.
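The greedy policy described above, selecting the feature with the highest conditional mutual information given what has already been queried, can be sketched on discrete data using empirical estimates and the chain rule I(x_j; y | S) = I((S, x_j); y) - I(S; y). The function names are assumptions; the paper replaces this oracle estimate with an amortized learned network.

```python
import numpy as np

def empirical_mi(a, b):
    """Empirical mutual information (nats) of two 1-D discrete arrays."""
    va, ia = np.unique(a, return_inverse=True)
    vb, ib = np.unique(b, return_inverse=True)
    joint = np.zeros((len(va), len(vb)))
    np.add.at(joint, (ia, ib), 1.0)
    joint /= joint.sum()
    pa, pb = joint.sum(axis=1), joint.sum(axis=0)
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / np.outer(pa, pb)[nz])).sum())

def greedy_cmi_select(X, y, k):
    """Greedily pick k feature columns, each maximizing the empirical
    conditional mutual information I(x_j; y | selected). A sketch of
    the oracle greedy policy on small discrete data."""
    selected = []
    for _ in range(k):
        def ctx(cols):
            # Encode a set of columns as one discrete joint variable.
            if not cols:
                return np.zeros(len(y), dtype=int)
            return np.unique(X[:, cols], axis=0, return_inverse=True)[1].ravel()
        base = empirical_mi(ctx(selected), y)
        scores = {j: empirical_mi(ctx(selected + [j]), y) - base
                  for j in range(X.shape[1]) if j not in selected}
        selected.append(max(scores, key=scores.get))
    return selected
```

On real data this empirical estimator is intractable for many features, which is exactly why the paper amortizes it with a learned model.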
The purpose of this work was to tackle practical issues that arise when using a tendon-driven robotic manipulator with a long, passive, flexible proximal section in medical applications. A separable robot that overcomes difficulties in actuation and sterilization is introduced, in which the body containing the electronics is reusable and the remainder is disposable. A control input that resolves the redundancy in the kinematics, along with a physical interpretation of this redundancy, is provided. The effect of a static change in the proximal section angle on bending angle error was explored under four testing conditions for a sinusoidal input. Bending angle error increased with increasing proximal section angle for all testing conditions, with an average error reduction of 41.48% for re-tension, 4.28% for hysteresis, and 52.35% for re-tension + hysteresis compensation relative to the baseline case. Two major sources of error in tracking the bending angle were identified: time delay from hysteresis and DC offset from the proximal section angle. Examination of these error sources revealed that the simple hysteresis compensation was most effective for removing time delay, and re-tension compensation for removing DC offset, which was the primary source of increasing error. The re-tension compensation was also tested for dynamic changes in the proximal section and reduced error in the final configuration of the tip by 89.14% relative to the baseline case.
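The two error sources identified above, a time delay from hysteresis and a DC offset from the proximal section angle, suggest a compensation of the form "advance the signal, then subtract the offset". The toy function below illustrates that structure on a sampled signal; the name, the sample-shift model of delay, and the constant-offset model are illustrative assumptions, not the paper's compensators.

```python
import numpy as np

def compensate(measured, dc_offset, delay_samples):
    """Toy compensation: advance the measured signal by the known
    hysteresis delay (in samples) and subtract the DC offset
    contributed by the proximal section angle."""
    advanced = np.roll(measured, -delay_samples)
    return advanced - dc_offset
```

For a periodic input (like the sinusoidal test input in the abstract), removing both terms recovers the desired trajectory exactly in this idealized model.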
With the rapid development of drone technologies, drones are now widely used in many applications, including military domains. In this paper, we propose a novel situation-aware DRL-based autonomous nonlinear drone mobility control algorithm for cyber-physical loitering munition applications. On the battlefield, designing a DRL-based autonomous control algorithm is not straightforward because real-world data gathering is generally not available. Therefore, the approach taken in this paper is to construct a cyber-physical virtual environment with Unity. Based on the virtual cyber-physical battlefield scenarios, a DRL-based automated nonlinear drone mobility control algorithm can be designed, evaluated, and visualized. Moreover, many obstacles exist that are harmful to linear trajectory control in real-world battlefield scenarios. Thus, our proposed autonomous nonlinear drone mobility control algorithm utilizes situation-aware components that are implemented with a Raycast function in the Unity virtual scenarios. Based on the gathered situation-aware information, the drone can autonomously and nonlinearly adjust its trajectory during flight. Therefore, this approach is clearly beneficial for avoiding obstacles in obstacle-deployed battlefields. Our visualization-based performance evaluation shows that the proposed algorithm is superior to other linear mobility control algorithms.
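The raycast-based situation awareness described above amounts to probing candidate directions for clearance and steering toward the freest one. The sketch below does this on a 2-D occupancy grid; the function names, the grid world, and the greedy heading choice are illustrative assumptions (Unity's actual Raycast operates in a 3-D physics scene).

```python
import numpy as np

def raycast(grid, pos, direction, max_steps=20):
    """March along `direction` from `pos` in a boolean occupancy grid
    and return the distance (in steps) to the first obstacle,
    or infinity if the ray leaves the grid unobstructed."""
    y, x = pos
    dy, dx = direction
    for step in range(1, max_steps + 1):
        yy = int(round(y + dy * step))
        xx = int(round(x + dx * step))
        if not (0 <= yy < grid.shape[0] and 0 <= xx < grid.shape[1]):
            break
        if grid[yy, xx]:
            return float(step)
    return float("inf")

def pick_heading(grid, pos, headings):
    """Situation-aware choice: among candidate headings, keep the one
    with the largest raycast clearance (a greedy avoidance sketch)."""
    return max(headings, key=lambda d: raycast(grid, pos, d))
```

A DRL policy would consume these clearance readings as observations rather than acting greedily, but the sensing step is the same idea.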
In the robotics and computer vision communities, extensive studies have been conducted on surveillance tasks, including human detection, tracking, and motion recognition with a camera. Additionally, deep learning algorithms are widely utilized in these tasks, as in other computer vision tasks. Existing public datasets are insufficient to develop learning-based methods that handle various surveillance scenarios, including outdoor and extreme situations such as harsh weather and low-illuminance conditions. Therefore, we introduce a new large-scale outdoor surveillance dataset named eXtremely large-scale Multi-modAl Sensor dataset (X-MAS), containing more than 500,000 image pairs and first-person-view data annotated by well-trained annotators. Moreover, a single pair contains multi-modal data (e.g., an IR image, an RGB image, a thermal image, a depth image, and a LiDAR scan). To the best of our knowledge, this is the first large-scale first-person-view outdoor multi-modal dataset focusing on surveillance tasks. We present an overview of the proposed dataset with statistics and present methods for exploiting the dataset with deep-learning-based algorithms. The latest information on the dataset and our study is available at https://github.com/lge-robot-navi, and the dataset will be available for download through a server.